Performance Evaluation of Some Inverse Iteration Algorithms on PowerXCell 8i Processor
Authors
Abstract
In this paper, we compare inverse iteration algorithms on the PowerXCell 8i processor, which is known as a heterogeneous environment. When some of the eigenvalues are close together or form clusters, all the eigenvectors associated with those eigenvalues must be reorthogonalized. Reorthogonalization is computationally expensive. The classical Gram-Schmidt (CGS) algorithm, the modified Gram-Schmidt (MGS) algorithm, and the Householder orthogonalization algorithm in terms of the compact WY representation are known reorthogonalization algorithms. These algorithms can be computed using level-1 and level-2 BLAS. Since the synergistic processor elements in the PowerXCell 8i processor achieve high performance for level-2 and level-3 BLAS, the orthogonalization algorithms other than MGS can be computed at high speed on parallel computers.
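The difference between CGS and MGS that the abstract alludes to can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the paper's implementation; the helper names `dot`, `normalize`, `cgs_step`, and `mgs_step` are hypothetical. In CGS every projection coefficient is taken from the original vector, so the coefficients amount to one matrix-vector product (a level-2 BLAS operation). In MGS each coefficient depends on the vector as already updated, which leaves a sequence of dependent dot products and vector updates (level-1 BLAS operations).

```python
import math

def dot(u, v):
    """Inner product of two vectors (a level-1 BLAS style operation)."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def cgs_step(Q, v):
    """Classical Gram-Schmidt: orthogonalize v against the orthonormal rows of Q.

    All coefficients use the ORIGINAL v, so they are independent of each
    other and together form the matrix-vector product Q v.
    """
    coeffs = [dot(q, v) for q in Q]  # independent of one another
    w = list(v)
    for c, q in zip(coeffs, Q):
        w = [wi - c * qi for wi, qi in zip(w, q)]
    return normalize(w)

def mgs_step(Q, v):
    """Modified Gram-Schmidt: each coefficient uses the UPDATED vector,
    so the loop is a chain of dependent dot products and vector updates."""
    w = list(v)
    for q in Q:
        c = dot(q, w)  # depends on the previous update
        w = [wi - c * qi for wi, qi in zip(w, q)]
    return normalize(w)

# Illustrative data (assumed): two orthonormal vectors and one new vector.
Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
v = [1.0, 2.0, 3.0]
q_cgs = cgs_step(Q, v)
q_mgs = mgs_step(Q, v)
```

With exactly orthonormal vectors in `Q` the two variants give the same result in exact arithmetic; they differ in how rounding errors propagate and in which BLAS level the work maps to.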
Related articles
Status of the QPACE Project
We give an overview of the QPACE project, which is pursuing the development of a massively parallel, scalable supercomputer for LQCD. The machine is a three-dimensional torus of identical processing nodes, based on the PowerXCell 8i processor. The nodes are connected by an FPGA-based, application-optimized network processor attached to the PowerXCell 8i processor. We present a performance analy...
An MPI Performance Monitoring Interface for Cell Based Compute Nodes
In this paper, we present a methodology for profiling parallel applications executing on the family of architectures commonly referred to as the “Cell” processor. Specifically, we examine Cell-centric MPI programs on hybrid clusters containing multiple Opteron and IBM PowerXCell 8i processors per node such as those used in the petascale Roadrunner system. We analyze the performance of our approach...
Lattice QCD Applications on QPACE
QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. A single QPACE node is based on the IBM PowerXCell 8i processor. The nodes are interconnected by a custom 3-dimensional torus network implemented on an FPGA. The compute power of the processor is provided by 8 Synergistic Processing Units. Making efficient use of these accelerator cores in scientific applica...
Self-Organizing Maps on the Cell Broadband Engine Architecture
We present and evaluate novel parallel implementations of Self-Organizing Maps for the Cell Broadband Engine Architecture. Motivated by the interactive nature of the data mining process, we evaluate the scalability of the implementations on two clusters using different network characteristics and incarnations (PS3 console and PowerXCell 8i) of the architecture. Our implementations use varying com...
Implementing 3D SPHARM Surfaces Registration on Cell Processor
Spherical harmonics (SPHARM) description is a highly promising surface-based morphometry (SBM) method and has been widely used in neuroimaging applications to model the surface of arbitrarily shaped but simply connected 3D objects. This paper focuses on SHREC, a recently developed general-purpose surface-matching method for 3D SPHARM registration. We implemented SHREC in MATLAB, then optimized a...